DETAILED ACTION

Response to Arguments 

Applicant's arguments, see Remarks, filed 07/13/2009, with respect to the
rejection(s) of claim(s) 1, 6, and 11 under 35 USC 103(a) have been fully considered 
and are persuasive. Therefore, the rejection has been withdrawn. However, upon 
further consideration, a new ground(s) of rejection is made in view of Papineni et al US 
5991710 A (hereinafter Papineni). Re "calculating a probability that the phrase is 
mapped to a semantic tag from a list of unordered semantic tags", Papineni explicitly 
teaches the identification of sequences of phrases/sentences containing words,
wherein unordered words are identified and linked/mapped to a phrase (Papineni Col. 5
lines 45 - Col. 6 line 50). Further, Papineni teaches the tagging of high priority words
within unordered sets of words (Papineni Col. 3 line 66 - Col. 4 line 11). The semantic
analysis interpreter and probability expectation maximization algorithm taught by Brill in 
view of Schabes render obvious the combination of Papineni to allow for the 
identification of all words found within a set of words regardless of order/sequence of 
words in a phrase or group of words (Col. 5 lines 45 - Col. 6 line 50). 

Further, the teachings of the expectation maximization algorithm applied to semantic
analysis are consistent with the present invention's expectation maximization algorithm
applied to tag sequences (k) (present invention spec, page 10). Additionally, Papineni
teaches tag sequences like the present invention (present invention spec, page 10), also
consistent with a sequence of words/phrases tagged on an unordered basis (Col. 5
lines 45 - Col. 6 line 50).



Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claims 1 and 4-15 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Brill et al. US 20020169596 A1 (hereinafter Brill) in view of Schabes et al. US 
5537317 A (hereinafter Schabes). 

Re claim 1, Brill teaches a method carried out by a processor, comprising: 

extracting a phrase from a training corpus ([0021], semantic interpreter analyzing 
sentences from a corpus); 

calculating a probability that the phrase is mapped to a semantic tag ([0025], 
semantic interpreter mapping components) from a list of unordered semantic tags; 

mapping the phrase to the semantic tag ([0033-0034], highest score for learning 
set) with the highest mapping probability ([0028] maximization algorithm); 

generating a mapping table containing the phrase and its corresponding 
semantic tag ([0025], semantic interpreter mapping components).

However, Brill fails to teach calculating a probability that the phrase is mapped to
a semantic tag from a list of unordered semantic tags.
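
For illustration only, the mapping steps recited in claim 1 (selecting, for each phrase, the semantic tag with the highest mapping probability and recording the result in a mapping table) can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Brill, Schabes, Papineni, or the application: the function name `build_mapping_table`, the tag names, and the probability values are all hypothetical.

```python
def build_mapping_table(tag_probabilities):
    """Map each phrase to its highest-probability semantic tag.

    tag_probabilities: dict of phrase -> dict of tag -> probability.
    The input values are hypothetical; in practice they would come
    from a model trained on a corpus.
    """
    table = {}
    for phrase, probs in tag_probabilities.items():
        # Select the tag whose mapping probability is largest.
        table[phrase] = max(probs, key=probs.get)
    return table


# Hypothetical probabilities for two phrases.
example = {
    "from boston": {"ORIGIN": 0.8, "DESTINATION": 0.2},
    "to denver": {"ORIGIN": 0.3, "DESTINATION": 0.7},
}
mapping_table = build_mapping_table(example)
```

Here `mapping_table` pairs each phrase with a single tag, which corresponds to the "mapping table containing the phrase and its corresponding semantic tag" language of the claim.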

Schabes teaches past limitations and an improvement upon them, wherein 
Schabes teaches that in the past, in order to ascertain proper usage, the grammaticality
of a sentence was computed as the probability of this sentence to occur in English. 
Such statistical approach assigns high probability to grammatically correct sentences, 
and low probability to ungrammatical sentences. The statistical is obtained by training 
on a collection of English sentences, or a training corpus. The corpus defines correct 
usage. As a result, when a sentence is typed in to such a grammar checking system, 
the probability of the entire sentence correlating with the corpus is computed. It will be 
appreciated in order to entertain the entire English vocabulary, about 60,000 words, a 
corpus of at several hundred trillion words must be used. Furthermore, a comparable 
number of probabilities must be stored on the computer. Thus the task of analyzing 
entire sentences is both computationally and storage intensive. In order to establish 
correct usage in the Subject System, it is the probability of a sequence of parts of 
speech which is derived. For this purpose, one can consider that there are between
100 and 400 possible parts of speech depending how sophisticated the system is to be.
This translates to a several million word training corpus as opposed to several hundred 
trillion. This type of analysis can be easily performed on standard computing platforms 
including the ones used for word processing. Thus in the subject system, a sentence is 
first broken up into parts of speech. For instance, the sentence "I heard this band play" 
is analyzed as follows: PRONOUN, VERB, DETERMINER, NOUN, VERB. The
probability of this part of speech sequence, is determined by comparing the sequence to 
the corpus. This is also not feasible unless one merely consider the so-called tri-grams. 
Tri-grams are triple of parts of speech which are adjacent in the input sentence. 
Analyzing three adjacent parts of speech is usually sufficient to establish correctness; 
and it the probability of these tri-grams which is utilized to establish that a particular 
sentence involves correct usage. Thus rather than checking the entire sentence, the 
probability of three adjacent parts of speech is computed from the training corpus 
(Schabes Col. 8 lines 13-51). 
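
The tri-gram approach described in the passage above can be illustrated with a short sketch. It is offered only as an assumption-laden example and not as Schabes's implementation: the function names, the three-sentence toy corpus, and the unsmoothed relative-frequency estimate are all hypothetical, and the probability of the first two tags of a sentence is ignored for brevity.

```python
from collections import Counter


def train_trigram_model(tagged_corpus):
    """Count part-of-speech tri-grams and the two-tag contexts that start them."""
    trigram_counts = Counter()
    context_counts = Counter()
    for tags in tagged_corpus:
        for i in range(len(tags) - 2):
            trigram_counts[tuple(tags[i:i + 3])] += 1
            context_counts[tuple(tags[i:i + 2])] += 1
    return trigram_counts, context_counts


def sequence_probability(tags, trigram_counts, context_counts):
    """Multiply P(t3 | t1, t2) over every adjacent tri-gram (no smoothing)."""
    prob = 1.0
    for i in range(len(tags) - 2):
        context = tuple(tags[i:i + 2])
        if context_counts[context] == 0:
            return 0.0  # context never seen in the training corpus
        prob *= trigram_counts[tuple(tags[i:i + 3])] / context_counts[context]
    return prob


# Hypothetical training corpus of part-of-speech sequences.
corpus = [
    ["PRONOUN", "VERB", "DETERMINER", "NOUN", "VERB"],
    ["PRONOUN", "VERB", "DETERMINER", "NOUN"],
    ["PRONOUN", "VERB", "PRONOUN"],
]
tri, ctx = train_trigram_model(corpus)

# Tag sequence for "I heard this band play".
p_good = sequence_probability(
    ["PRONOUN", "VERB", "DETERMINER", "NOUN", "VERB"], tri, ctx)
# A scrambled sequence whose tri-grams never occur in the corpus.
p_bad = sequence_probability(
    ["DETERMINER", "VERB", "VERB", "NOUN", "PRONOUN"], tri, ctx)
```

Because only counts over tag triples are stored, the table size depends on the number of parts of speech rather than on the vocabulary, which is the storage saving the passage describes.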

Further, Schabes teaches that the entries of a dictionary are selected and ranked 
based on the part of speech assigned to the given word in context. The entries 
corresponding to the word in context are first selected. The other entries not relevant to 
the current context are still available at the request of the user. The part of speech of 
the given word in context is disambiguated with the part of speech tagger described 
above. By way of illustration, assuming the word "left" in the sentence "He left a minute 
ago", the part of speech tagger assigns the tag "verb past tense" for the word "left" in 
that sentence. For this case, the Subject System selects the entries for the verb "leave" 
corresponding to the usage of "left" in that context and then selects the entries for "left" 
not used in that context, in particular the ones for "left" as an adjective, as an adverb 
and as a noun (Schabes Col. 24 lines 45-60). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Brill to incorporate calculating a probability 
that the phrase is mapped to a semantic tag from a list of semantic tags as taught by
Schabes to allow for the tagging of semantic portions of a sentence (such as parts of 
speech) in order to prioritize (i.e. the best ranking/probability) semantic tags within a 
sentence to maintain the proper context based on adjacent tags in a sentence (Schabes 
Col. 24 lines 45-60). 

However, Brill in view of Schabes fails to teach the use of semantic unordered lists.

Papineni teaches the identification of word mapping relative to an unordered list
of grammatical components, wherein word-set feature functions formed and supported
by the translation model of the present invention are characterized such that s and t are
unordered sets of words. That is, s is in S if all n words of s are in S, regardless of the
order in which they occur in S. Likewise, t is in T if all n words of t are in T, regardless of
the order in which they occur in T. An example of a word-set feature function or
operation performed by the model in the ATIS domain would be searching for the
existence of the unordered words "departing" and "after" among the formal sentence
candidates (stored in target language candidate store 30), given an English sentence
having the unordered words "leave" and "after" contained therein. For instance given
the sample English sentences (E.sub.1 through E.sub.6) and the sample formal
sentences (F.sub.1 through F.sub.5) above, the word-set feature function fires on
E.sub.1 and F.sub.1, thus, identifying the pair (E.sub.1, F.sub.1). The same is true for
the pair (E.sub.2, F.sub.1) (Col. 5 lines 45 - Col. 6 line 50).
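
The word-set feature function described above can be illustrated with a brief sketch. It assumes whitespace tokenization and a binary (fires / does not fire) result; the function name and the two sample sentences are hypothetical, since the sentences E.sub.1 through E.sub.6 and F.sub.1 through F.sub.5 are not reproduced in this action.

```python
def word_set_feature(s, t, source_sentence, target_sentence):
    """Return 1 if every word of s occurs in the source sentence and every
    word of t occurs in the target sentence, in any order; otherwise 0."""
    source_words = set(source_sentence.lower().split())
    target_words = set(target_sentence.lower().split())
    return int(set(s) <= source_words and set(t) <= target_words)


# Unordered word sets taken from the example in the passage.
s = {"leave", "after"}
t = {"departing", "after"}

# Hypothetical English and formal-language sentences.
english = "I want to leave after noon"
formal = "list flights departing after 1200"

fires = word_set_feature(s, t, english, formal)
# Word order does not matter: a reordered source sentence still fires.
fires_reordered = word_set_feature(s, t, "after noon I leave", formal)
# A missing word prevents the feature from firing.
no_fire = word_set_feature(s, t, "I want to arrive after noon", formal)
```

Representing s and t as sets is what makes the test independent of the order in which the words occur.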

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Brill in view of Schabes to incorporate
calculating a probability that the phrase is mapped to a semantic tag from a list of
unordered semantic tags as taught by Papineni to allow for the identification of all words
found within a set of words regardless of order/sequence of words in a phrase or group
of words (Col. 5 lines 45 - Col. 6 line 50).

Re claims 6 and 11, Brill teaches a processor executing a computer program
product to:

calculate a mapping probability that a semantic tag of a set of candidate 
semantic tags is assigned to a phrase ([0025]), wherein the calculation of the mapping 
probability is performed by means of a statistical procedure based on a set of phrases 
constituting a corpus of sentences ([0024]), each of the phrases having assigned a set 
of candidate semantic tags ([0028]). 

generate a mapping table from the performed mapping ([0035]).

However, Brill fails to teach mapping probability that is performed by means of a
statistical procedure based on a set of phrases.

However, Brill fails to teach calculating a probability from a list of semantic tags.

Schabes teaches past limitations and an improvement upon them, wherein
Schabes teaches that in the past, in order to ascertain proper usage, the grammaticality 
of a sentence was computed as the probability of this sentence to occur in English. 
Such statistical approach assigns high probability to grammatically correct sentences,
and low probability to ungrammatical sentences. The statistical is obtained by training
on a collection of English sentences, or a training corpus. The corpus defines correct 
usage. As a result, when a sentence is typed in to such a grammar checking system, 
the probability of the entire sentence correlating with the corpus is computed. It will be 
appreciated in order to entertain the entire English vocabulary, about 60,000 words, a 
corpus of at several hundred trillion words must be used. Furthermore, a comparable 
number of probabilities must be stored on the computer. Thus the task of analyzing 
entire sentences is both computationally and storage intensive. In order to establish 
correct usage in the Subject System, it is the probability of a sequence of parts of 
speech which is derived. For this purpose, one can consider that there are between 
100 and 400 possible parts of speech depending how sophisticated the system is to be. 
This translates to a several million word training corpus as opposed to several hundred 
trillion. This type of analysis can be easily performed on standard computing platforms 
including the ones used for word processing. Thus in the subject system, a sentence is 
first broken up into parts of speech. For instance, the sentence "I heard this band play" 
is analyzed as follows: PRONOUN, VERB, DETERMINER, NOUN, VERB. The 
probability of this part of speech sequence, is determined by comparing the sequence to 
the corpus. This is also not feasible unless one merely consider the so-called tri-grams. 
Tri-grams are triple of parts of speech which are adjacent in the input sentence. 
Analyzing three adjacent parts of speech is usually sufficient to establish correctness; 
and it the probability of these tri-grams which is utilized to establish that a particular 
sentence involves correct usage. Thus rather than checking the entire sentence, the 
probability of three adjacent parts of speech is computed from the training corpus
(Schabes Col. 8 lines 13-51). 

Further, Schabes teaches that the entries of a dictionary are selected and ranked 
based on the part of speech assigned to the given word in context. The entries 
corresponding to the word in context are first selected. The other entries not relevant to
the current context are still available at the request of the user. The part of speech of 
the given word in context is disambiguated with the part of speech tagger described 
above. By way of illustration, assuming the word "left" in the sentence "He left a minute 
ago", the part of speech tagger assigns the tag "verb past tense" for the word "left" in
that sentence. For this case, the Subject System selects the entries for the verb "leave" 
corresponding to the usage of "left" in that context and then selects the entries for "left" 
not used in that context, in particular the ones for "left" as an adjective, as an adverb 
and as a noun (Schabes Col. 24 lines 45-60). 

Schabes also teaches well known previous techniques, wherein in the past, in
order to ascertain proper usage, the grammaticality of a sentence was computed as the
probability of this sentence to occur in English. Such statistical approach assigns high
probability to grammatically correct sentences, and low probability to ungrammatical
sentences. The statistical is obtained by training on a collection of English sentences,
or a training corpus. The corpus defines correct usage. As a result, when a sentence is
typed in to such a grammar checking system, the probability of the entire sentence
correlating with the corpus is computed. It will be appreciated in order to entertain the
entire English vocabulary, about 60,000 words, a corpus of at several hundred trillion
words must be used. Furthermore, a comparable number of probabilities must be
stored on the computer. Thus the task of analyzing entire sentences is both 
computationally and storage intensive (Schabes Col. 8 lines 12-28). 

Further, Schabes overcomes previous techniques, wherein rather than 
comparing the above mentioned probabilities, in a preferred embodiment, the subject 
system compares the geometric average of these probabilities by taking into account
their word lengths, i.e. by comparing the logarithm of P1 divided by the number of words
in S1, and the logarithm of P2 divided by the number of words in S2. This is important
in cases where a single word may be confused with a sequence of words such as
"maybe" and "may be". Directly comparing the probabilities of the part of speech 
sequences would favor shorter sentences instead of longer sentences, an not 
necessarily correct result, since the statistical language model assigns lower 
probabilities to longer sentences (Schabes Col. 9 lines 55-67). 
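
The length normalization described in the passage above can be illustrated numerically. The sketch below assumes two hypothetical sentence probabilities P1 and P2 and hypothetical word counts for S1 and S2; none of these values comes from Schabes. Dividing the logarithm of a probability by the number of words is equivalent to taking the logarithm of the geometric average per word.

```python
import math


def normalized_log_prob(prob, num_words):
    """Logarithm of the sequence probability divided by the number of words,
    i.e. the log of the per-word geometric average."""
    return math.log(prob) / num_words


# Hypothetical values: S1 is the shorter sentence (e.g. using "maybe"),
# S2 the longer one (e.g. using "may be").
p1, s1_len = 1e-4, 4
p2, s2_len = 5e-5, 5

# A direct comparison favors the shorter sentence S1.
raw_prefers_s1 = p1 > p2

# After normalizing by length, S2 scores higher than S1.
norm_prefers_s2 = (
    normalized_log_prob(p2, s2_len) > normalized_log_prob(p1, s1_len)
)
```

With these assumed values the normalized scores are approximately -2.30 for S1 and -1.98 for S2, so the longer sentence is no longer penalized merely for containing more words.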

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Brill to incorporate mapping probability that
is performed by means of a statistical procedure based on a set of phrases and
semantic tags assigned to a phrase as taught by Schabes to allow for the recognition of
parts of speech and individual in addition to the identification of sentences/phrases,
wherein higher/lower probabilities are assigned to sentences and the length of the
sentences in an unsupervised or even supervised system (Schabes Col. 9 lines 55-67)
and to further allow for the tagging of semantic portions of a sentence (such as parts of
speech) in order to prioritize (i.e. the best ranking/probability) semantic tags within a
sentence to maintain the proper context based on adjacent tags in a sentence (Schabes
Col. 24 lines 45-60).

However, Brill in view of Schabes fails to teach the use of semantic unordered lists.

Papineni teaches the identification of word mapping relative to an unordered list 
of grammatical components, wherein word-set feature functions formed and supported 
by the translation model of the present invention are characterized such that s and t are 
unordered sets of words. That is, s is in S if all n words of s are in S, regardless of the 
order in which they occur in S. Likewise, t is in T if all n words of t are in T, regardless of 
the order in which they occur in T. An example of a word-set feature function or 
operation performed by the model in the ATIS domain would be searching for the 
existence of the unordered words "departing" and "after" among the formal sentence 
candidates (stored in target language candidate store 30), given an English sentence 
having the unordered words "leave" and "after" contained therein. For instance given 
the sample English sentences (E.sub.1 through E.sub.6) and the sample formal 
sentences (F.sub.1 through F.sub.5) above, the word-set feature function fires on 
E.sub.1 and F.sub.1, thus, identifying the pair (E.sub.1, F.sub.1). The same is true for
the pair (E.sub.2, F.sub.1) (Col. 5 lines 45 - Col. 6 line 50). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Brill in view of Schabes to incorporate 
calculating a probability that the phrase is mapped to a semantic tag from a list of 
unordered semantic tags as taught by Papineni to allow for the identification of all words
found within a set of words regardless of order/sequence of words in a phrase or group
of words (Col. 5 lines 45 - Col. 6 line 50).



Re claims 7 and 12, Brill teaches the method according to claim 1, for each
phrase further comprising calculating a set of mapping probabilities ([0025]), providing 
the probability for each semantic tag of the set of candidate semantic tags being 
assigned to the phrase ([0028]). 

However, Brill fails to teach providing the probability for each semantic tag of the 
set of candidate semantic tags 

Schabes teaches well known previous techniques, wherein in the past, in order 
to ascertain proper usage, the grammaticality of a sentence was computed as the
probability of this sentence to occur in English. Such statistical approach assigns high 
probability to grammatically correct sentences, and low probability to ungrammatical 
sentences. The statistical is obtained by training on a collection of English sentences, 
or a training corpus. The corpus defines correct usage. As a result, when a sentence is 
typed in to such a grammar checking system, the probability of the entire sentence 
correlating with the corpus is computed. It will be appreciated in order to entertain the 
entire English vocabulary, about 60,000 words, a corpus of at several hundred trillion 
words must be used. Furthermore, a comparable number of probabilities must be 
stored on the computer. Thus the task of analyzing entire sentences is both
computationally and storage intensive (Schabes Col. 8 lines 12-28). 

Further, Schabes overcomes previous techniques, wherein rather than 
comparing the above mentioned probabilities, in a preferred embodiment, the subject 
system compares the geometric average of these probabilities by taking into account 
their word lengths, i.e. by comparing the logarithm of P1 divided by the number of words
in S1, and the logarithm of P2 divided by the number of words in S2. This is important
in cases where a single word may be confused with a sequence of words such as 
"maybe" and "may be". Directly comparing the probabilities of the part of speech 
sequences would favor shorter sentences instead of longer sentences, an not 
necessarily correct result, since the statistical language model assigns lower 
probabilities to longer sentences (Schabes Col. 9 lines 55-67). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Brill to incorporate the probability for each
semantic tag of the set of candidate semantic tags as taught by Schabes to allow for the 
recognition of parts of speech and individual in addition to the identification of 
sentences/phrases, wherein higher/lower probabilities are assigned to sentences and 
the length of the sentences in an unsupervised or even supervised system (Schabes 
Col. 9 lines 55-67). 



Application/Control Number: 10/578,640 Page 14

Art Unit: 2626

Re claims 8 and 13, Brill teaches the method according to claim 2, further
comprising determining one semantic tag of the set of candidate semantic tags ([0025])
having the highest mapping probability of the set of mapping probabilities and mapping
the one semantic tag to the phrase ([0024]).

However, Brill fails to teach determining one semantic tag of the set of candidate 
semantic tags having the highest mapping probability 

Schabes teaches well known previous techniques, wherein in the past, in order 
to ascertain proper usage, the grammaticality of a sentence was computed as the 
probability of this sentence to occur in English. Such statistical approach assigns high 
probability to grammatically correct sentences, and low probability to ungrammatical 
sentences. The statistical is obtained by training on a collection of English sentences, 
or a training corpus. The corpus defines correct usage. As a result, when a sentence is 
typed in to such a grammar checking system, the probability of the entire sentence 
correlating with the corpus is computed. It will be appreciated in order to entertain the 
entire English vocabulary, about 60,000 words, a corpus of at several hundred trillion 
words must be used. Furthermore, a comparable number of probabilities must be 
stored on the computer. Thus the task of analyzing entire sentences is both 
computationally and storage intensive (Schabes Col. 8 lines 12-28). 

Further, Schabes overcomes previous techniques, wherein rather than 
comparing the above mentioned probabilities, in a preferred embodiment, the subject 
system compares the geometric average of these probabilities by taking into account 
their word lengths, i.e. by comparing the logarithm of P1 divided by the number of words
in S1, and the logarithm of P2 divided by the number of words in S2. This is important
in cases where a single word may be confused with a sequence of words such as
"maybe" and "may be". Directly comparing the probabilities of the part of speech
sequences would favor shorter sentences instead of longer sentences, an not 
necessarily correct result, since the statistical language model assigns lower 
probabilities to longer sentences (Schabes Col. 9 lines 55-67). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Brill to incorporate the probability for each 
semantic tag of the set of candidate semantic tags as taught by Schabes to allow for the 
recognition of parts of speech and individual in addition to the identification of 
sentences/phrases, wherein higher/lower probabilities are assigned to sentences and 
the length of the sentences in an unsupervised or even supervised system (Schabes 
Col. 9 lines 55-67). 

Re claims 4, 9, and 14, Brill teaches the method according to claim 1, wherein
the statistical procedure comprises an expectation maximization algorithm ([0028]). 
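
For context, an expectation maximization procedure over phrases and unordered tag sets can be sketched as shown below. This is a generic illustration under stated assumptions and is not the algorithm of Brill, Papineni, or the application: it assumes each training sentence supplies a list of phrases and an unordered set of candidate tags, that each phrase carries exactly one tag from that set, and that phrases are treated independently. The function name, corpus, and tag names are hypothetical.

```python
from collections import defaultdict


def em_phrase_tag(corpus, iterations=10):
    """Estimate P(tag | phrase) from (phrases, unordered tag set) pairs."""
    # Initialization: uniform over every tag that co-occurs with the phrase.
    prob = defaultdict(dict)
    for phrases, tags in corpus:
        for phrase in phrases:
            for tag in tags:
                prob[phrase][tag] = 1.0
    for phrase in prob:
        n = len(prob[phrase])
        for tag in prob[phrase]:
            prob[phrase][tag] = 1.0 / n

    for _ in range(iterations):
        counts = defaultdict(lambda: defaultdict(float))
        # E-step: share each phrase occurrence among the sentence's tags
        # in proportion to the current probabilities.
        for phrases, tags in corpus:
            for phrase in phrases:
                total = sum(prob[phrase][tag] for tag in tags)
                for tag in tags:
                    counts[phrase][tag] += prob[phrase][tag] / total
        # M-step: renormalize the expected counts per phrase.
        for phrase, tag_counts in counts.items():
            norm = sum(tag_counts.values())
            for tag, count in tag_counts.items():
                prob[phrase][tag] = count / norm
    return prob


# Hypothetical corpus; each tag collection is a set, so it has no order.
corpus = [
    (["from boston", "to denver"], {"ORIGIN", "DESTINATION"}),
    (["from boston"], {"ORIGIN"}),
    (["to denver", "on monday"], {"DESTINATION", "DATE"}),
    (["on monday"], {"DATE"}),
]
probabilities = em_phrase_tag(corpus)
best_tags = {p: max(t, key=t.get) for p, t in probabilities.items()}
```

In this toy corpus the sentences with a single phrase and a single tag disambiguate the others, so the estimates for the remaining phrases sharpen with each iteration even though the tags are never aligned to phrases by position.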

Re claims 5, 10, and 15, Brill teaches the method according to claim 3 or 4, 
further comprising storing of performed mappings between a candidate semantic tag 
([0025]) and a phrase in form of a mapping table ([0024]) in order to derive a grammar
being applicable to unknown sentences or unknown phrases. 

However, Brill fails to teach deriving a grammar being applicable to unknown
sentences or unknown phrases.

Schabes teaches well known previous techniques, wherein in the past, in order
to ascertain proper usage, the grammaticality of a sentence was computed as the
probability of this sentence to occur in English. Such statistical approach assigns high 
probability to grammatically correct sentences, and low probability to ungrammatical 
sentences. The statistical is obtained by training on a collection of English sentences, 
or a training corpus. The corpus defines correct usage. As a result, when a sentence is 
typed in to such a grammar checking system, the probability of the entire sentence 
correlating with the corpus is computed. It will be appreciated in order to entertain the 
entire English vocabulary, about 60,000 words, a corpus of at several hundred trillion 
words must be used. Furthermore, a comparable number of probabilities must be 
stored on the computer. Thus the task of analyzing entire sentences is both 
computationally and storage intensive (Schabes Col. 8 lines 12-28). 

Further, Schabes overcomes previous techniques, wherein rather than 
comparing the above mentioned probabilities, in a preferred embodiment, the subject 
system compares the geometric average of these probabilities by taking into account 
their word lengths, i.e. by comparing the logarithm of P1 divided by the number of words
in S1, and the logarithm of P2 divided by the number of words in S2. This is important
in cases where a single word may be confused with a sequence of words such as 
"maybe" and "may be". Directly comparing the probabilities of the part of speech 
sequences would favor shorter sentences instead of longer sentences, an not 
necessarily correct result, since the statistical language model assigns lower 
probabilities to longer sentences (Schabes Col. 9 lines 55-67).

Furthermore, Schabes teaches that in particular importance in grammar checking
is the ability to detect the sequence of parts of speech as they exist in a given sentence. 
Correct sentences will have parts of speech which follow a normal sequence, such that 
by analyzing the parts of speech sequence one can detect the probability that the 
sentence is correct in terms of its grammar. While prior art systems have tagged a 
sentence for parts of speech and have analyzed the sequences of parts of speech for 
the above mentioned probability, these probability have never been utilized in grammar 
checking and correcting system (Schabes Col. 3 lines 14-25 & Fig. 1). 

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Brill to incorporate deriving a grammar 
being applicable to unknown sentences or unknown phrases as taught by Schabes to 
allow for the analysis of any input, particularly in any language and being able to not 
only translate but interpret the semantic and syntactic structure of discourse, wherein 
probabilities that check if grammar is correct based on a sequential sentence input 
(Schabes Col. 3 lines 14-25 & Fig. 1). 

Conclusion 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Michael C. Colucci whose telephone number is
(571) 270-1847. The examiner can normally be reached on 9:30 am - 6:00 pm,
Monday-Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 